Summary

Suppose we are interested in the mean of an outcome variable that is missing not at random. Suppose,
however, that one has available a fully observed shadow variable, which is associated with the
outcome but independent of the missingness process conditional on covariates and the possibly
unobserved outcome. Such a variable may be a proxy or a mismeasured version of the outcome
available for all individuals. We have previously established necessary and sufficient conditions
for identification of the full data law in such a setting, and have described semiparametric
estimators including a doubly robust estimator of the outcome mean. Here, we propose two alternative
doubly robust estimators for the outcome mean, which may be viewed as extensions of analogous
methods under missingness at random, but enjoy different properties. We assess correctness of
the required working models via straightforward goodness-of-fit tests.

Some key words: Doubly robust estimation; Missingness not at random; Shadow variable.


1. Introduction 

Doubly robust methods are designed to mitigate estimation bias due to model misspecification
in observational studies and imperfect experiments. Such methods have grown in popularity in
recent years for estimation with missing data and other forms of coarsening (Robins et al., 1994;
Scharfstein et al., 1999; Van der Laan & Robins, 2003; Bang & Robins, 2005; Tsiatis, 2006).
There exist various constructions of doubly robust estimators for the mean of an outcome that is
missing at random; see Kang & Schafer (2007). In contrast, for data missing not at random,
difficulty of identification undermines one’s ability to obtain accurate inferences, and doubly robust
estimation is far more challenging. Identification of a full data model means that the parameters
indexing the model are uniquely determined by the observed data, i.e., the data that are
actually observed on the individuals. Statistical inference based on non-identifiable models may be
misleading and of limited interest in practice; see Miao et al. (2015). Under missingness at
random, the full data law, i.e., the joint distribution of all variables of interest, is nonparametrically
identified from the observed data. However, under missingness not at random, identification is
no longer possible without further restrictions on the missingness process. Although no general
identification results are available for data missing not at random, one may identify the full data
law under specific assumptions. Building on earlier work by D’Haultfoeuille (2010), Wang et al.
(2014) and Zhao & Shao (2014), Miao et al. (2015) used a fully observed shadow variable to
establish a general identification framework for data missing not at random. Such a variable is
associated with the outcome conditional on covariates, but independent of the missingness
conditional on covariates and the outcome (Kott, 2014); it may be available in many empirical studies,
where a fully observed proxy or a mismeasured version of the outcome is available. For example,
in a study of mental health of children in Connecticut (Zahner et al., 1992; Ibrahim et al., 2001),
researchers were interested in evaluating the prevalence of students with abnormal
psychopathological status based on their teacher’s assessment, which was subject to missingness. A separate
parent report, available for all children in the study, is a proxy for the teacher’s assessment, but is
unlikely to be related to the teacher’s response rate conditional on covariates and her assessment
of the student; in this case the parental assessment constitutes a valid shadow variable. Other
examples can be found in Miao et al. (2015) and Wang et al. (2014).

Throughout, we let $Y$ denote the outcome, $R$ its missingness indicator with $R = 1$ if $Y$ is
observed and $R = 0$ otherwise, and $X$ the fully observed covariates. Suppose that one has
also fully observed a shadow variable $Z$ that satisfies

Assumption 1. (i) $Z \not\perp\!\!\!\perp Y \mid X$; (ii) $Z \perp\!\!\!\perp R \mid (Y, X)$.

Assumption 1 formalizes the idea that the shadow variable only affects the missingness through
its association with the outcome. We provide a directed acyclic graph in the Supplementary
Material that can help to understand the assumption. The shadow variable introduces additional
conditional independence conditions, which impose further restrictions on the missingness process,
and thus provides better opportunity for identification despite the fact that data may be missing
not at random. Miao et al. (2015) presented a brief review of such problems, and gave necessary
and sufficient conditions, as well as sufficient conditions, for identification with a shadow variable.
In particular, if the outcome is binary, the full data law is identifiable with a binary shadow
variable. But for a continuous outcome, a binary shadow variable does not impose enough restrictions
to identify the full data law; see the Supplementary Material for a counterexample. Identification for
a continuous outcome requires at least one continuous shadow variable, but even then, additional
conditions are needed. We consider a location-scale model for the density function:

$$f(y \mid x, z, r) = \frac{1}{\sigma_r(x, z)} f_r\left\{\frac{y - \mu_r(x, z)}{\sigma_r(x, z)}\right\}, \quad r = 0, 1, \tag{1}$$

with unrestricted functions $\mu_r$ and $\sigma_r$, and density functions $f_r$. Under certain regularity
conditions summarized in the Appendix, we have previously proved identification of the full data
law if either $f(y \mid x, z, r = 1)$ or $f(y \mid x, z, r = 0)$ follows model (1), even if the missingness
process is unrestricted (Miao et al., 2015). Aside from Assumption 1, model (1) includes many
commonly used models, for instance, Gaussian models, and thus essentially demonstrates that
lack of identification is not an issue in many familiar situations. However, one cannot overstate
the central role of the shadow variable for identification. Without such a variable, identification is
no longer guaranteed for model (1), even if one were to assume a parametric missingness model.
For additional and extensive discussion about identification under missingness not at random
with a shadow variable, see Miao et al. (2015) and Wang et al. (2014).

With models satisfying the corresponding identification conditions, previous authors have
developed several non-doubly robust estimators. Among them, inverse probability weighted
estimation (Wang et al., 2014) and pseudo-likelihood estimation (Zhao & Shao, 2014) are sensitive
to model misspecification; and nonparametric estimation (D’Haultfoeuille, 2010) requires an
unrealistically large sample size for reasonable performance when the covariate dimension is moderate
to large. In contrast, a doubly robust approach remains consistent and asymptotically normal
under partial misspecification. Specifically, Miao et al. (2015) developed a doubly robust estimator
based on a three-part model for the full data: a model for the joint distribution of the outcome and
the shadow variable in complete cases; a model for the propensity score evaluated at a reference
value of the outcome; and a log odds ratio model encoding the association of the outcome and
the missingness process. Under correct specification of the log odds ratio model, the doubly
robust estimator is consistent if either of the other two models is correct, but not necessarily both.
However, the construction of a doubly robust estimator is not unique. In this paper, we develop
two alternative doubly robust estimators of the outcome mean that enjoy different properties, and
we compare them both in theory and via simulations reported in the Supplementary Material.


2. Doubly robust estimators 

Under Assumption 1, we factorize the conditional density function of $(Z, Y, R)$ given $X$ as

$$f(z, y, r \mid x) = c(x) \exp\{(1 - r)\,\mathrm{OR}(y \mid x)\}\, \mathrm{pr}(r \mid y = 0, x)\, f(z, y \mid r = 1, x), \tag{2}$$

where $c(x) = \mathrm{pr}(r = 1 \mid x)/\mathrm{pr}(r = 1 \mid y = 0, x)$; $\mathrm{pr}(r = 1 \mid y = 0, x)$ is the response
probability evaluated at the reference level $y = 0$, and is referred to as the baseline propensity score;
$f(z, y \mid r = 1, x)$ is the joint density function of $(Z, Y)$ conditional on $X$ among the complete
cases, i.e., the subset with $r = 1$, and is referred to as the baseline outcome density; and

$$\mathrm{OR}(y \mid x) = \log \frac{\mathrm{pr}(r = 0 \mid y, x)\, \mathrm{pr}(r = 1 \mid y = 0, x)}{\mathrm{pr}(r = 0 \mid y = 0, x)\, \mathrm{pr}(r = 1 \mid y, x)}$$

is the log of the conditional odds ratio function relating $Y$ and $R$ given $X$, with
$E[\exp\{\mathrm{OR}(y \mid x)\} \mid r = 1, x] < \infty$ and $\mathrm{OR}(y = 0 \mid x) = 0$. For a continuous outcome, we require that
$f(z, y \mid r = 1, x)$ satisfies model (1) to guarantee identification. For estimation, we specify separate
parametric models $\mathrm{pr}(r = 1 \mid y = 0, x; \alpha)$, $f(z, y \mid r = 1, x; \beta)$, and $\mathrm{OR}(y \mid x; \gamma)$. We suppose
throughout that $\mathrm{OR}(y \mid x; \gamma)$ is correctly specified, which can be achieved by specifying a
relatively flexible model, or by following the approach suggested by Higgins et al. (2008) if information
on the reasons for missingness is available. From (2), we have the following identities:


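$$\mathrm{pr}(r = 1 \mid y, x) = \frac{\mathrm{pr}(r = 1 \mid y = 0, x)}{\mathrm{pr}(r = 1 \mid y = 0, x) + \exp\{\mathrm{OR}(y \mid x)\}\, \mathrm{pr}(r = 0 \mid y = 0, x)}, \tag{3}$$

$$f(z, y \mid r = 0, x) = \frac{\exp\{\mathrm{OR}(y \mid x)\}}{E[\exp\{\mathrm{OR}(y \mid x)\} \mid r = 1, x]}\, f(z, y \mid r = 1, x), \tag{4}$$

$$E(y \mid r = 0, x) = \frac{E[\exp\{\mathrm{OR}(y \mid x)\}\, y \mid r = 1, x]}{E[\exp\{\mathrm{OR}(y \mid x)\} \mid r = 1, x]}. \tag{5}$$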


The propensity score and its reciprocal, i.e., the inverse probability weight function
$W(x, y; \alpha, \gamma) = 1/\mathrm{pr}(r = 1 \mid x, y; \alpha, \gamma)$, are determined by the baseline propensity score model
$\mathrm{pr}(r = 1 \mid x, y = 0; \alpha)$ and the log odds ratio model $\mathrm{OR}(y \mid x; \gamma)$ as in (3); the conditional
outcome mean among the incomplete cases, $E(y \mid r = 0, x; \beta, \gamma)$, is determined by the baseline
outcome model and the log odds ratio model as in (5).
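
As a minimal numerical sketch of identities (3) and (5), the following Python functions compute the
inverse probability weight from a baseline propensity score and a log odds ratio, and approximate
the incomplete-case mean by exponentially tilting complete-case outcomes. The function names are
hypothetical, and replacing the conditional expectations in (5) by averages over complete-case
values at a fixed covariate value is an assumption made only for illustration.

```python
import math


def inverse_probability_weight(baseline_ps, log_or):
    """Return W = 1 / pr(r = 1 | y, x) implied by identity (3).

    baseline_ps: pr(r = 1 | y = 0, x), the baseline propensity score.
    log_or:      OR(y | x), the log odds ratio evaluated at y.
    """
    return 1.0 + math.exp(log_or) * (1.0 - baseline_ps) / baseline_ps


def tilted_mean(ys, log_ors):
    """Approximate E(y | r = 0, x) via identity (5).

    ys:      complete-case outcome values at a fixed covariate value x
             (assumption: they stand in for draws from f(y | r = 1, x)).
    log_ors: OR(y | x) evaluated at each value in ys.
    """
    tilt = [math.exp(v) for v in log_ors]
    return sum(t * y for t, y in zip(tilt, ys)) / sum(tilt)
```

Under a logistic propensity score with linear predictor $a + \gamma y$, the baseline propensity score is
the inverse logit of $a$ and the log odds ratio is $-\gamma y$, so the first function returns $1 + \exp(-a - \gamma y)$.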

Estimation of $\beta$ only involves the complete cases. Let $\hat{E}$ denote the empirical mean; we solve

$$\hat{E}\{r S(z, y, x; \beta)\} = 0, \tag{6}$$

with score function $S(z, y, x; \beta) = \partial \log\{f(z, y \mid r = 1, x; \beta)\}/\partial \beta$. Estimation of $\alpha$ and $\gamma$ is
motivated by a classic estimating equation, which follows from the fact that the weighted
means of any vector functions $G(x, y)$ and $H(x)$ among the complete cases equal their population
means: $E[\{W(x, y; \alpha, \gamma) r - 1\}\{G(x, y)^{T}, H(x)^{T}\}^{T}] = 0$, where $G(x, y)$ and $H(x)$ are
user-specified vector functions of dimension equal to that of $\gamma$ and $\alpha$, respectively, such that
$E[\partial W(x, y; \alpha, \gamma) r/\partial(\alpha, \gamma)\, \{G(x, y)^{T}, H(x)^{T}\}]$ is nonsingular for all $(\alpha, \gamma)$. For example, if
$\mathrm{pr}(r = 1 \mid y, x; \alpha, \gamma)$ follows a logistic model and thus $W(x, y; \alpha, \gamma) = 1 + \exp\{-(1, x^{T})\alpha - \gamma y\}$,
we may naturally choose $G(x, y) = y$ and $H(x) = (1, x^{T})^{T}$. Because $y$ is missing for
$r = 0$, the classic estimating equation is not feasible. However, Assumption 1 allows us to
replace $y$ with the shadow variable $z$, and to replace $G(x, y)$ with $G(x, z)$. To further derive doubly
robust estimators, we incorporate the baseline outcome model into the estimating equation for
$(\alpha, \gamma)$. Letting $G_1(x, z; \beta, \gamma) = G(x, z) - E\{G(x, z) \mid r = 0, x; \beta, \gamma\}$, we solve

$$\hat{E}[\{W(x, y; \alpha, \gamma) r - 1\}\{G_1(x, z; \hat\beta, \gamma)^{T}, H(x)^{T}\}^{T}] = 0, \tag{7}$$

with $G(x, z)$ and $H(x)$ such that $E[\partial W(x, y; \alpha, \gamma) r/\partial(\alpha, \gamma)\, \{G_1(x, z; \beta, \gamma)^{T}, H(x)^{T}\}]$ is
nonsingular for all $(\alpha, \beta, \gamma)$. The shadow variable $Z$ is used as a proxy for $Y$; thus, a choice of $Z$ that
is highly correlated with $Y$ is desirable for efficiency.

Using $(\hat\alpha, \hat\beta, \hat\gamma)$ obtained from equations (6) and (7), we construct three different estimators
for the outcome mean that are consistent if either the baseline outcome model or the baseline
propensity score model is correctly specified, together with the log odds ratio model.

A regression estimator with residual bias correction was previously described by Miao et al.
(2015). We use the weighted residual to correct the bias of the conditional mean among
incomplete cases. Letting $M_0(x; \beta, \gamma) = E(y \mid r = 0, x; \beta, \gamma)$, the estimator is

$$\hat\mu_1 = \hat{E}[W(x, y; \hat\alpha, \hat\gamma)\, r\, \{y - M_0(x; \hat\beta, \hat\gamma)\} + M_0(x; \hat\beta, \hat\gamma)].$$
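
The regression estimator with residual bias correction can be sketched in a few lines of Python,
assuming the fitted weights and fitted incomplete-case means have already been computed for each
individual. The function name and the convention of storing a missing outcome as `None` are
hypothetical choices for this illustration.

```python
def mu1_residual_corrected(w, r, y, m0):
    """Empirical mean of W * r * (y - M0) + M0 over all individuals.

    w:  fitted inverse probability weights (used only where r == 1).
    r:  missingness indicators, 1 if the outcome is observed, else 0.
    y:  outcomes, with None for missing values.
    m0: fitted conditional means among incomplete cases, M0(x).
    """
    total = 0.0
    for w_i, r_i, y_i, m_i in zip(w, r, y, m0):
        # The weighted residual is available only for complete cases.
        correction = w_i * (y_i - m_i) if r_i == 1 else 0.0
        total += correction + m_i
    return total / len(r)
```

For instance, with two complete cases with outcomes 1 and 3, two incomplete cases, all weights for
complete cases equal to 2, and all fitted means equal to 2, the function returns 2.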


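A Horvitz-Thompson estimator with extended weights employs an extended baseline
propensity score model and an extended weight function. The extended baseline propensity score model
with unknown parameter $\phi$ satisfies $\mathrm{pr}_{\mathrm{ext}}(r = 1 \mid y = 0, x; \phi) = \mathrm{pr}(r = 1 \mid y = 0, x; \hat\alpha)$ only at
$\phi = 0$. For example, we can specify

$$\mathrm{pr}_{\mathrm{ext}}(r = 1 \mid y = 0, x; \phi) = \frac{\mathrm{pr}(r = 1 \mid y = 0, x; \hat\alpha)}{\mathrm{pr}(r = 1 \mid y = 0, x; \hat\alpha) + \exp\{\phi g(x)\}\, \mathrm{pr}(r = 0 \mid y = 0, x; \hat\alpha)},$$

with user-specified scalar function $g(x)$. The extended weight function $W_{\mathrm{ext}}(x, y; \phi)$, or its
reciprocal, is determined as in (3) with $\mathrm{OR}(y \mid x)$ and $\mathrm{pr}(r = 1 \mid y = 0, x)$ replaced by
$\mathrm{OR}(y \mid x; \hat\gamma)$ and $\mathrm{pr}_{\mathrm{ext}}(r = 1 \mid y = 0, x; \phi)$, respectively. We estimate $\phi$ by solving

$$\hat{E}[\{W_{\mathrm{ext}}(x, y; \phi) r - 1\}\{M_0(x; \hat\beta, \hat\gamma) - \hat\mu_{\mathrm{reg}}\}] = 0, \tag{8}$$

with previously obtained $(\hat\beta, \hat\gamma)$ and $\hat\mu_{\mathrm{reg}} = \hat{E}\{(1 - r) M_0(x; \hat\beta, \hat\gamma) + r y\}$. The
Horvitz-Thompson estimator with extended weights is

$$\hat\mu_2 = \hat{E}\left[\frac{W_{\mathrm{ext}}(x, y; \hat\phi)\, r}{\hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi)\, r\}}\, y\right].$$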
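
The two ingredients of this estimator can be sketched as follows, assuming the example extension of
the baseline propensity score given above and taking the fitted extended weights as given. The
function names and the use of `None` for a missing outcome are hypothetical; solving equation (8)
for the extension parameter is not shown.

```python
import math


def extended_baseline_ps(baseline_ps, phi, g_x):
    """Extended baseline propensity score; equals baseline_ps at phi = 0.

    baseline_ps: fitted pr(r = 1 | y = 0, x).
    phi:         extension parameter.
    g_x:         user-specified scalar function g evaluated at x.
    """
    odds_shift = math.exp(phi * g_x)
    return baseline_ps / (baseline_ps + odds_shift * (1.0 - baseline_ps))


def mu2_extended_weights(w_ext, r, y):
    """Normalized weighted mean of the observed outcomes.

    Equals E_n[W_ext * r * y] / E_n[W_ext * r]; with nonnegative weights
    it is a convex combination of the observed outcome values.
    """
    num = sum(w_i * y_i for w_i, r_i, y_i in zip(w_ext, r, y) if r_i == 1)
    den = sum(w_i for w_i, r_i in zip(w_ext, r) if r_i == 1)
    return num / den
```

Because the weights are normalized by their own sum over complete cases, the result always lies
between the smallest and largest observed outcome.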


A regression estimator with an extended outcome model involves an extended outcome
model $M_{0\mathrm{ext}}(x; \psi)$ with parameter $\psi$ satisfying $M_{0\mathrm{ext}}(x; \psi) = M_0(x; \hat\beta, \hat\gamma)$ only at $\psi = 0$. If
$M_0(x; \hat\beta, \hat\gamma) = \lambda\{Q(x; \hat\beta, \hat\gamma)\}$ for some inverse link $\lambda$ and some function $Q$, we can specify
$M_{0\mathrm{ext}}(x; \psi) = \lambda\{Q(x; \hat\beta, \hat\gamma) + \psi q(x)\}$ with a scalar function $q(x)$. We estimate $\psi$ by solving

$$\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) - 1\}\, r\, \{y - M_{0\mathrm{ext}}(x; \psi)\}] = 0, \tag{9}$$

with previously obtained $(\hat\alpha, \hat\gamma)$. The regression estimator with an extended outcome model is

$$\hat\mu_3 = \hat{E}\{(1 - r) M_{0\mathrm{ext}}(x; \hat\psi) + r y\}.$$
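
As an illustrative sketch, suppose the inverse link is the identity, so that the extended outcome
model is the fitted incomplete-case mean plus $\psi q(x)$. Equation (9) is then linear in $\psi$ and has a
closed-form solution. The function name, the identity-link choice, and the `None` convention for
missing outcomes are assumptions for this example only; other links would require a numerical
root finder.

```python
def mu3_extended_outcome(w, r, y, m0, q):
    """Regression estimator with an additive extended outcome model.

    Solves sum (W - 1) * r * (y - M0 - psi * q) = 0 for psi, then averages
    (1 - r) * (M0 + psi * q) + r * y. Returns (estimate, psi).
    """
    rows = list(zip(w, r, y, m0, q))
    # Both sums run over complete cases only, since r = 0 terms vanish.
    num = sum((w_i - 1.0) * (y_i - m_i) for w_i, r_i, y_i, m_i, _ in rows if r_i == 1)
    den = sum((w_i - 1.0) * q_i for w_i, r_i, _, _, q_i in rows if r_i == 1)
    psi = num / den
    total = 0.0
    for _, r_i, y_i, m_i, q_i in rows:
        # Observed outcomes enter directly; missing ones are imputed.
        total += y_i if r_i == 1 else m_i + psi * q_i
    return total / len(rows), psi
```

A value of `psi` far from zero indicates that the weighted complete-case residuals of the baseline
outcome model do not balance.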

The estimators $\hat\mu_1$, $\hat\mu_2$ and $\hat\mu_3$ may have very different characteristics, although all three
estimators are doubly robust.

Theorem 1. Under Assumption 1, if the log odds ratio model $\mathrm{OR}(y \mid x; \gamma)$ is correct, and
the probability limit of equations (6), (7), (8) and (9) has a unique solution, then $\hat\mu_1$, $\hat\mu_2$ and $\hat\mu_3$
are consistent if either $f(z, y \mid r = 1, x; \beta)$ or $\mathrm{pr}(r = 1 \mid y = 0, x; \alpha)$ is correctly specified.

The extended models not only provide double robustness, but also provide a strategy to check
if the working models are correct. We prove in the Appendix that if the baseline propensity
score model is correct, $\hat\phi$ converges to 0 in probability; and if the baseline outcome model is
correct, $\hat\psi$ converges to 0 in probability. Therefore, one may use this property to assess whether
the working models are correctly specified by checking whether $\hat\phi$ and $\hat\psi$ are within sampling
variability of zero, respectively. However, one should acknowledge that the space of possible
departures from the assumed model may be prohibitively large relative to the proposed test, so
that the resulting goodness-of-fit test will generally have good power against certain alternatives
but not in all possible directions away from the specified working model. We explore the power
of the proposed goodness-of-fit test via a simulation study in the Supplementary Material.
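
One simple way to check whether an estimated extension parameter is within sampling variability of
zero is a Wald-type test, sketched below. The text does not spell out the form of the test
statistic, so the Wald form, the normal reference distribution, and the availability of a standard
error (for instance from a bootstrap) are all assumptions of this example; the function name is
hypothetical.

```python
import math


def wald_test_zero(estimate, std_error):
    """Two-sided Wald test of the null hypothesis that a parameter is 0.

    estimate:  estimated extension parameter (phi-hat or psi-hat).
    std_error: an estimate of its standard error, supplied by the user.
    Returns (z statistic, two-sided p-value under a standard normal law).
    """
    z = estimate / std_error
    # For a standard normal Z, pr(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

At level 0.05, the corresponding working model would be flagged as possibly misspecified when the
returned p-value falls below 0.05.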

All three doubly robust estimators rely on a correct log odds ratio model, since inference about
the law of $Y$ requires an accurate evaluation of the dependence between the missingness process
and the outcome, which is captured by the log odds ratio model $\mathrm{OR}(y \mid x; \gamma)$. To the best of
our knowledge, with the exception of Miao et al. (2015), previous doubly robust estimators have
assumed that this log odds ratio is known, either to equal the null value of 0 under missingness
at random (Bang & Robins, 2005; Tsiatis, 2006; Van der Laan & Robins, 2003), or to be of a
known functional form with no unknown parameters (Vansteelandt et al., 2007; Robins et al.,
2008). We have relaxed these more stringent assumptions.


3. Relation to previous doubly robust estimators and comparisons 

Previous doubly robust estimators under missingness at random can be viewed as special
cases of our estimators. Under missingness at random, $\mathrm{OR}(y \mid x) = 0$, $\mathrm{pr}(r = 1 \mid x, y = 0) =
\mathrm{pr}(r = 1 \mid x)$, the inverse probability weight function $W(x; \alpha) = 1/\mathrm{pr}(r = 1 \mid x; \alpha)$ does not
vary with $y$, and the conditional mean among the population, $M(x; \beta)$, equals that among the
incomplete cases, $M_0(x; \beta, \gamma)$. The estimator $\hat\mu_1' = \hat{E}[W(x; \hat\alpha)\, r\, \{y - M(x; \hat\beta)\} + M(x; \hat\beta)]$ of
Kang & Schafer (2007) is a special case of the regression estimator with residual bias
correction; the estimator $\hat\mu_2' = \hat{E}[W_{\mathrm{ext}}(x; \hat\phi)\, r / \hat{E}\{W_{\mathrm{ext}}(x; \hat\phi)\, r\}\, y]$ proposed by Robins et al. (2007),
with an extended logistic propensity score model $\mathrm{logit}\, \mathrm{pr}_{\mathrm{ext}}(r = 1 \mid x; \phi) = (1, x^{T})\hat\alpha + \phi g(x)$,
is a special case of the Horvitz-Thompson estimator with extended weights; the estimator
$\hat\mu_3' = \hat{E}\{M_{\mathrm{ext}}(x; \hat\psi)\}$ proposed by Robins et al. (2007), with an extended outcome model
$M_{\mathrm{ext}}(x; \psi)$ satisfying $\hat{E}[W(x; \hat\alpha)\, r\, \{y - M_{\mathrm{ext}}(x; \hat\psi)\}] = 0$ and $\hat{E}[r\{y - M_{\mathrm{ext}}(x; \hat\psi)\}] = 0$, is a
special case of the regression estimator with an extended outcome model.

The three proposed doubly robust estimators enjoy some of the properties of their missingness
at random analogs. The estimator $\hat\mu_2$ is a convex combination of the observed outcome values. It
satisfies the boundedness property (Robins et al., 2007) that the estimator falls in the parameter
space for the outcome mean almost surely. Such estimators are preferred when the inverse
probability weights are highly variable, because they rule out estimates outside the sample space.
Boundedness is not guaranteed for $\hat\mu_1$. If the range of $M_{0\mathrm{ext}}(x; \psi)$ is contained in the sample
space of the outcome, $\hat\mu_3$ also satisfies the boundedness condition, but this does not hold in
general. For example, if the outcome is continuous, and $M_{0\mathrm{ext}}(x; \psi) = M_0(x; \hat\beta, \hat\gamma) + \psi$, the range
of $\hat\mu_3$ may be outside the sample space of the outcome mean.

The three proposed estimators offer certain improvements in terms of bias when both models
are misspecified. The asymptotic bias of $\hat\mu_1$ can be written as

$$\mathrm{Bias}_1 = E[\{W(x, y; \alpha^*, \gamma^*) r - 1\}\{y - M_0(x; \beta^*, \gamma^*)\}],$$

and the asymptotic bias of $\hat\mu_3$ has the same form with $M_0(x; \beta^*, \gamma^*)$ replaced by $M_{0\mathrm{ext}}(x; \psi^*)$,
where $(\alpha^*, \beta^*, \gamma^*, \psi^*)$ are the probability limits of the corresponding estimators. The bias is driven by
the degree of misspecification of both the weight function and the conditional mean among the
incomplete cases. As pointed out by Robins et al. (2007) and Vermeulen & Vansteelandt (2014),
without further restrictions on the inverse probability weights, $\mathrm{Bias}_1$ gets inflated in regions
with large weights. However, if the components of $H(x)$ in equation (7) include a constant
function, then $E\{W(x, y; \alpha^*, \gamma^*) r\} = 1$, which restricts the amount of variability of the inverse
probability weights. Thus, $\mathrm{Bias}_1$ does not explode with large weights.

In simulation studies, we found that the three doubly robust estimators approximate the true
outcome mean if either of the baseline models is correct, but they are biased if neither baseline
model is correct. For the case with moderately variable weights, the relative magnitude of the bias
depends on the specific data generating process, but for the case with highly variable weights, the
Horvitz-Thompson estimator with extended weights has smaller bias. If the baseline outcome
model is correct, the estimated parameter of the extended outcome model, $\hat\psi$, is close to 0; and if the
baseline propensity score model is correct, the estimated parameter of the extended weight model, $\hat\phi$,
is close to 0. We also performed formal tests of the null hypotheses $H_0: \phi = 0$ and $H_0: \psi = 0$,
respectively, at level 0.05. The results show an empirical type I error close to 0.05 if the required
baseline propensity score model or baseline outcome model is correct, respectively (i.e., the true
value of $\phi$ or $\psi$ equals 0). Such tests have good power in moderate samples if the
required model is incorrect. We recommend the proposed hypothesis tests to check
for severe misspecification of the baseline models in practice.

4. Discussion 

Extensions of the doubly robust methods described in this work to other functionals, such as
a parameter $\delta$ solving a full data estimating equation $E\{U(z, y, x; \delta)\} = 0$, can be achieved by
replacing $Y$ with $U$ wherever $Y$ occurs in the estimating equations and solving the doubly robust
estimating equation for the parameter of interest. The methods also have potential application in
related areas, such as longitudinal data analysis and causal inference.

Appendix
Proof of Theorem 1 

We need the following lemma, which we prove in the Supplementary Material.

Lemma A1. Under Assumption 1, suppose that the log odds ratio model is correct, and that the
probability limit of equations (6) and (7) has a unique solution. For any square integrable vector function
$D(z, y, x)$, scalar function $V(x)$, and $(\hat\alpha, \hat\beta, \hat\gamma)$ solving equations (6) and (7),

(i) if $\mathrm{pr}(r = 1 \mid y = 0, x; \alpha)$ is correct, then $\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) r - 1\} D(z, y, x)]$ converges to 0 in
probability;

(ii) if $f(z, y \mid r = 1, x; \beta)$ is correct, then $\hat{E}[r \exp\{\mathrm{OR}(y \mid x; \hat\gamma)\} V(x) \{D(z, y, x) - E[D(z, y, x) \mid r =
0, x; \hat\beta, \hat\gamma]\}]$ converges to 0 in probability;

(iii) if either of the baseline models is correct, then $\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) r - 1\}\{D(z, y, x) - E[D(z, y, x) \mid
r = 0, x; \hat\beta, \hat\gamma]\}]$ converges to 0 in probability.

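Proof of Theorem 1. Suppose that the log odds ratio model is correctly specified, and that the
probability limit of the estimating equations has a unique solution.

1. Double robustness of $\hat\mu_1$. If either of the baseline models is correct, from (iii) of Lemma A1,
$\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) r - 1\}\{y - E(y \mid r = 0, x; \hat\beta, \hat\gamma)\}]$ converges to 0; therefore
$\hat{E}[W(x, y; \hat\alpha, \hat\gamma)\, r\, \{y - M_0(x; \hat\beta, \hat\gamma)\} + M_0(x; \hat\beta, \hat\gamma)]$ converges to the true outcome mean.

2. Double robustness of $\hat\mu_2$. From (i) of Lemma A1, if the baseline propensity score model is
correct, $\hat{E}[\{W_{\mathrm{ext}}(x, y; \phi = 0) r - 1\}\{M_0(x; \hat\beta, \hat\gamma) - \hat\mu_{\mathrm{reg}}\}] = \hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) r - 1\}\{M_0(x; \hat\beta, \hat\gamma) -
\hat\mu_{\mathrm{reg}}\}]$ converges to 0, i.e., $\phi = 0$ is a solution of the probability limit of equation (8).
Thus, the solution of equation (8), $\hat\phi$, converges to 0, and $\lim_{n \to +\infty} \hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi) r\} = 1$,
$\lim_{n \to +\infty} \hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi) r y\} = \lim_{n \to +\infty} \hat{E}\{W(x, y; \hat\alpha, \hat\gamma) r y\} = E(Y)$. If the baseline outcome
model is correct, $\hat{E}[(1 - r)\{y - M_0(x; \hat\beta, \hat\gamma)\}]$ converges to 0; $\hat\mu_{\mathrm{reg}} = \hat{E}[(1 - r) M_0(x; \hat\beta, \hat\gamma) + r y]$
converges to the true outcome mean; and $\hat{E}(y - \hat\mu_{\mathrm{reg}})$ converges to 0. By definition of the
extended weight function, $\{W_{\mathrm{ext}}(x, y; \hat\phi) - 1\} r = r \exp\{\mathrm{OR}(y \mid x; \hat\gamma)\} V(x)$ with $V(x) = \mathrm{pr}_{\mathrm{ext}}(r =
0 \mid y = 0, x; \hat\phi)/\mathrm{pr}_{\mathrm{ext}}(r = 1 \mid y = 0, x; \hat\phi)$. From (ii) of Lemma A1, $\hat{E}[\{W_{\mathrm{ext}}(x, y; \hat\phi) - 1\}\, r\, \{y -
M_0(x; \hat\beta, \hat\gamma)\}]$ converges to 0. Thus, $\hat{E}[\{W_{\mathrm{ext}}(x, y; \hat\phi) r - 1\}\{y - M_0(x; \hat\beta, \hat\gamma)\}]$ converges to 0, and

$$\begin{aligned}
\hat\mu_2 ={}& \frac{1}{\hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi) r\}}\, \hat{E}[\{W_{\mathrm{ext}}(x, y; \hat\phi) r - 1\}\{y - M_0(x; \hat\beta, \hat\gamma)\}] \\
&+ \frac{1}{\hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi) r\}}\, \hat{E}[\{W_{\mathrm{ext}}(x, y; \hat\phi) r - 1\}\{M_0(x; \hat\beta, \hat\gamma) - \hat\mu_{\mathrm{reg}}\}] \\
&+ \frac{1}{\hat{E}\{W_{\mathrm{ext}}(x, y; \hat\phi) r\}}\, \hat{E}(y - \hat\mu_{\mathrm{reg}}) + \hat\mu_{\mathrm{reg}}
\end{aligned}$$

converges to the true outcome mean in probability.

3. Double robustness of $\hat\mu_3$. If $\mathrm{pr}(r = 1 \mid x, y = 0; \alpha)$ is correct, from (i) of Lemma A1,
$\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) r - 1\}\{y - M_{0\mathrm{ext}}(x; \hat\psi)\}]$ converges to 0. Noting equation (9), we have that
$\hat{E}[(1 - r)\{y - M_{0\mathrm{ext}}(x; \hat\psi)\}]$ converges to 0. Thus, $\hat\mu_3 = \hat{E}\{(1 - r) M_{0\mathrm{ext}}(x; \hat\psi) + r y\}$ converges to the
true outcome mean. If $f(z, y \mid r = 1, x; \beta)$ is correct, then $\hat{E}[(1 - r)\{y - M_0(x; \hat\beta, \hat\gamma)\}]$
converges to 0. Since $\{W(x, y; \hat\alpha, \hat\gamma) - 1\} r = r \exp\{\mathrm{OR}(y \mid x; \hat\gamma)\} V(x)$ with $V(x) = \mathrm{pr}(r = 0 \mid y =
0, x; \hat\alpha)/\mathrm{pr}(r = 1 \mid y = 0, x; \hat\alpha)$, from (ii) of Lemma A1, $\hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) - 1\}\, r\, \{y - M_{0\mathrm{ext}}(x; \psi =
0)\}] = \hat{E}[\{W(x, y; \hat\alpha, \hat\gamma) - 1\}\, r\, \{y - M_0(x; \hat\beta, \hat\gamma)\}]$ converges to 0. That is, $\psi = 0$ is a solution of
the probability limit of equation (9). Thus, the solution of equation (9), $\hat\psi$, converges to 0, and
$\lim_{n \to +\infty} \hat{E}\{(1 - r) M_{0\mathrm{ext}}(x; \hat\psi) + r y\} = \lim_{n \to +\infty} \hat{E}\{(1 - r) M_0(x; \hat\beta, \hat\gamma) + r y\} = E(Y)$. □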

Regularity conditions for model (1) 

The full data law is identifiable if either $f(y \mid z, x, r = 1)$ or $f(y \mid z, x, r = 0)$ follows the location-scale
model (1), and the corresponding density function $f_{r=1}$ or $f_{r=0}$ satisfies the following conditions:

(a) the characteristic function $\varphi(t)$ of the density function $f(v)$ satisfies $0 < |\varphi(t)| < C \exp(-\delta |t|)$ for
$t \in \mathbb{R}$ and some constants $C, \delta > 0$;

(b) conditional on $x$, $\mu(z, x)$ and $\sigma(z, x)$ are continuously differentiable and integrable with respect to $z$; $f(v)$
is continuously differentiable, and $\int_{-\infty}^{+\infty} |v \cdot \partial f(v)/\partial v|\, dv$ is finite;

(c) there exist some linear one-to-one mapping $M : f\{(v - a)/b\} \mapsto h(t, a, b)$ and some value $-\infty \le
t_0 \le +\infty$ such that $\lim_{t \to t_0} h(t, a, b)/h(t, a', b')$ either equals zero or infinity for any $a, a' \in \mathbb{R}$, $b, b' >
0$ with $(a, b) \ne (a', b')$.

Many commonly used models satisfy conditions (a)-(c), for example, the Gaussian models with $f$ the
standard normal density function, $M$ the inverse Laplace transform, $h(t, a, b)$ the moment-generating
function of a normal density function with mean $a$ and variance $b^2$, and $t_0 = +\infty$.